A bit of work history

Early(ish) speech synthesis

Upon my arrival as an exchange student at Joensuun korkeakoulu (soon to become Joensuun yliopisto/University of Joensuu) in 1983, I was overwhelmed – in a very positive way – by the advanced equipment of their Phonetics Lab. Not only did I find a veritable Kay Sonagraph (watch Takayuki Arai’s demo to see how such a spectrograph worked), a state-of-the-art microcomputer setup for acoustic analysis (a dual floppy drive Motorola Exorset running Matti Karjalainen’s SPS-01), a sine-wave generator, an oscilloscope, and analog intensity and fundamental frequency meters, but also the famous OVE III formant synthesizer in all its glory:

Fonema's OVE III (photo source unknown)

This was worlds apart from the laboratory situation at Freie Universität Berlin’s linguistics department, where you could barely make analog recordings…

The OVE synthesizer was not hooked up to the Exorset, though, because there was no software to run it. Instead, it was connected to an electronic typewriter which could directly transfer the various parameter names and values – a huge improvement over manual control, where you had to flip switches (four for the parameter selection and six for the parameter value; see the ADDRESS and DATA areas of the OVE’s front panel) to input binary numbers, one bit at a time.

I had only done some very simple FORTRAN programming at the time (thanks to courses offered by an FU education scientist clearly ahead of his time, Hartmut Warlich), and the Exorset only came with a rather limited BASIC compiler in addition to its assembler. So it was time to learn assembly language, which made it possible to bypass the typewriter and instead prepare tables of parameter value changes over time on the Exorset, enabling the OVE to produce rudimentary connected speech. (My General Phonetics students of the late 1980s then had to suffer through making this setup generate short words; typically, it took them close to an hour to come up with something that sounded remotely like /auto/, for example.)
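The idea of such a parameter table can be illustrated with a small sketch – hypothetical Python, not the original Exorset assembly, and with purely invented parameter numbers and values. Following the front-panel layout described above, each update combines a 4-bit parameter address with a 6-bit value:

```python
# Hypothetical sketch of the parameter-table idea; the real program ran in
# 6800 assembly on the Exorset, and all names/values here are invented.

def encode_update(address: int, value: int) -> int:
    """Pack a 4-bit parameter address and a 6-bit value into one word,
    mirroring the ADDRESS and DATA switch groups on the OVE front panel."""
    if not 0 <= address < 16:
        raise ValueError("address must fit in 4 bits")
    if not 0 <= value < 64:
        raise ValueError("value must fit in 6 bits")
    return (address << 6) | value

# A table of (time_ms, address, value) updates, e.g. ramping one synthesis
# parameter over time -- stepping through such a table at the scheduled
# times is what turned isolated settings into connected speech.
table = [(0, 2, 10), (50, 2, 20), (100, 2, 30)]
words = [(t, encode_update(a, v)) for t, a, v in table]
```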

Networking and email infancy

Working on my MA thesis in 1984, I had the opportunity to run fundamental frequency extraction software at Tampere University of Technology. It was very impressive to see that, after digitizing the speech recordings in Tampere (from a Swiss NAGRA-IV tape recorder to a custom-built 16-bit analog-digital converter), I didn’t have to go back there to work with the f0 curves but could instead simply log on to the Tampere mainframe via a Tektronix graphics terminal in Joensuu University’s Computing Center, connected via DECnet (at the time, and almost into the 1990s, all Finnish universities had Digital Equipment Corporation’s VAX/VMS mainframe systems, and they all were interconnected with DECnet).

But it was at least as breathtaking to realize that you could send messages to anyone else in Finnish academia from your VAX terminal (and, even in the Humanities, everybody already had their own terminal in the second half of the 1980s); these messages were called emails.
And it did not take long before the Finnish universities got connected to international networks — not only other DECnets, but large multinational email networks like EARN, and on to BITNET. In the late eighties, we could already send email basically anywhere, with the help of gateways to the networks using the internet protocol that later became the global networking standard and made possible what we call the Internet today. At the time, though, if you wanted to send an email to a BITNET-external recipient, you had to prepare it painstakingly with a prescribed header format and lines of exactly 80 characters, so every shorter line had to be padded with the correct number of trailing spaces. Not a big problem — much more frustrating was the fact that most of Europe lagged behind: while I could reach colleagues in the US or in Sweden, Central Europeans had not yet adopted the concept of individual email addresses for every mainframe user, despite being connected to the same physical networks. (For some typical examples of historic email woes, have a look at Comments of a So-so Happy Crossnet User in NetMonth 9/2, 1988 – the whole journal issue makes for strangely entertaining reading, from today’s perspective.)
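The 80-character rule mentioned above amounted to fixed-length records, a legacy of punched-card formats. A small sketch of the padding step – hypothetical Python for illustration only; back then the messages were of course prepared by hand or with local mainframe tools:

```python
# Illustrative only: padding message lines to a fixed 80-character record,
# as gateway-bound BITNET mail required. Function name is invented.

def pad_to_record(line: str, width: int = 80) -> str:
    """Pad a line with trailing spaces to exactly `width` characters."""
    if len(line) > width:
        raise ValueError("line exceeds the fixed record length")
    return line.ljust(width)

body = ["Dear colleague,", "", "greetings from Joensuu!"]
records = [pad_to_record(line) for line in body]
assert all(len(r) == 80 for r in records)
```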

Signalyze

The same NetMonth issue also carries an article, likewise by Eric Keller, about the project for an online phonetics newsbulletin called foNETiks; this is how I got to know Eric, who was also working on a much bigger project, his Signalyze software for speech analysis.

A waveform window in Signalyze 3.0

At a time when the de facto standard tool for digital measurements of speech acoustics, Entropic’s ESPS/Waves+ (parts of which live on in KTH’s Wavesurfer), was prohibitively expensive for many smaller phonetics labs, this $400 Macintosh program was a very attractive option for many speech scientists — until Praat appeared.
I was marginally involved with some Signalyze programming, mainly its FileConverter utility; it was intriguing to get to know some of the Macintosh’s internals, such as its peculiar file system, where every file consisted of two forks, resource and data.

Speech Examination

Acoustic analysis of impaired speech was still in its infancy in the 1990s, and there were hardly any systematic analysis/evaluation protocols shared by a larger community. Eric Keller’s Speech Examination system tried to fill this void.

From the Speech Examination Manual

Detailed instructions on how to elicit speech, how to record and digitize it, and how to systematically evaluate it for non-typical traits were published for different types of data, from held vowels through diadochokinesis to stretches of connected spontaneous speech, including language-specific stimulus lists for English, French, German, and Finnish.

In hindsight, the Speech Examination seems to have been simultaneously ahead of and behind its time: the need for and the potential of a tool like this were not immediately recognized by many clinical researchers, and the implicit link to commercial Signalyze might have made it suspect to other speech researchers, ready to embark on open-source projects like Praat. So perhaps it was just a question of unfortunate timing – in any case, the Speech Examination protocol never got as widely employed as it deserved. And today, of course, machine learning algorithms have not left much space for the sensible use of manual acoustic analysis in this area, in general.

More speech synthesis

KIT network

Free and open software

European Master’s in Clinical Linguistics

The EMCL is an EU-funded interdisciplinary Master’s program that started out at the beginning of the new millennium as a cooperation of the universities of Potsdam, Groningen and Milan-Bicocca, with the University of Joensuu soon joining them. Over the years it has become the focus of linguistics teaching at Joensuu (since 2010 University of Eastern Finland), with its international students markedly outnumbering local linguistics students. To me, it has long represented (and still does) a rather ideal example of successfully applying the vision of a modern, connected Europe to practical cooperation.

Information on EMCL’s latest incarnation as a joint project of Groningen, Ghent and Joensuu can be found at https://emcl.eu/.